A Non-parametric Maximum Entropy Clustering

نویسندگان

  • Hideitsu Hino
  • Noboru Murata
چکیده

Clustering is a fundamental tool for exploratory data analysis. Information theoretic clustering is based on the optimization of information theoretic quantities such as entropy and mutual information. Recently, since these quantities can be estimated in non-parametric manner, non-parametric information theoretic clustering gains much attention. Assuming the dataset is sampled from a certain cluster, and assigning different sampling weights depending on the clusters, the cluster conditional information theoretic quantities are estimated. In this paper, a simple clustering algorithm is proposed based on the principle of maximum entropy. The algorithm is experimentally shown to be comparable to or outperform conventional non-parametric clustering methods.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Maximum Probability and Relative Entropy Maximization. Bayesian Maximum Probability and Empirical Likelihood

Works, briefly surveyed here, are concerned with two basic methods: Maximum Probability and Bayesian Maximum Probability; as well as with their asymptotic instances: Relative Entropy Maximization and Maximum Non-parametric Likelihood. Parametric and empirical extensions of the latter methods – Empirical Maximum Maximum Entropy and Empirical Likelihood – are also mentioned. The methods are viewe...

متن کامل

Empirical Maximum Entropy Methods

A method, which we suggest to call the Empirical Maximum Entropy method, is implicitly present at Maximum Entropy Empirical Likelihood method, as its special, non-parametric case. From this vantage point the entropy-based empirical approach to estimation is surveyed.

متن کامل

Most Likely Maximum Entropy for Population Analysis with Region-Censored Data

The paper proposes a new non-parametric density estimator from region-censored observations with application in the context of population studies, where standard maximum likelihood is affected by over-fitting and non-uniqueness problems. It is a maximum entropy estimator that satisfies a set of constraints imposing a close fit to the empirical distributions associated with the set of censoring ...

متن کامل

Proposing a Novel Cost Sensitive Imbalanced Classification Method based on Hybrid of New Fuzzy Cost Assigning Approaches, Fuzzy Clustering and Evolutionary Algorithms

In this paper, a new hybrid methodology is introduced to design a cost-sensitive fuzzy rule-based classification system. A novel cost metric is proposed based on the combination of three different concepts: Entropy, Gini index and DKM criterion. In order to calculate the effective cost of patterns, a hybrid of fuzzy c-means clustering and particle swarm optimization algorithm is utilized. This ...

متن کامل

Determination of height of urban buildings based on non-parametric estimation of signal spectrum in SAR data tomography

Nowadays, the TomoSAR technique has been able to overcome the limitations of radar interferometry techniques in separating multiple scatterers of pixels. By extending the principles of virtual aperture in the elevation direction, these techniques pay much attention in the analysis of urban challenging areas. Despite the expectation of interference of the distribution of buildings with different...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014